ABSTRACT

Many researchers assume that unequal cell frequencies
in analysis of variance (ANOVA) designs result from poor planning.
However, there are several valid reasons why one might have to
analyze an unequal-n data matrix. The present study reviewed four
categories of methods for treating unequal-n matrices by ANOVA: (a)
unaltered data (least-squares solution and unweighted means
solution); (b) data substitution (grand mean method, cell mean
method, Winer method, Snedecor-Cochran method); (c) data deletion;
and (d) data clustering (unreplicated cell mean method, unreplicated
random data clustering method, replicated random data clustering
method). The methods were compared empirically and theoretical
problems with each were discussed. (Author)

UNEQUAL CELL FREQUENCIES IN ANALYSIS OF VARIANCE:
A REVIEW AND EXTENSION OF METHODOLOGY FOR MULTIPLE MISSING
OBSERVATIONS



The majority of experimental studies in educational research that concern
the analysis of variance (ANOVA) contain equal cell frequencies. Since most
of these investigations are completed in tightly controlled university settings
or laboratory situations, it is almost always possible to ensure that sufficient
Ss are available to produce an equal-n data matrix in a factorial ANOVA design.
Thus, it is not surprising that most commonly used texts in educational
statistics discuss only the equal-n, factorial ANOVA solution. Further, many
applied statisticians take the attitude that a researcher has done poor
pre-experiment planning if he allows himself to get into an unequal-n
circumstance; one is even made to feel guilty about it!

Unfortunately, these equal-n biases of the majority of educational
statisticians do little for the researchers in large public school
situations where unequal-n ANOVA problems are the rule rather than the
exception. Apart from possible lack of adequate planning for the experiment,
what are some common reasons for unequal n's to arise in the factorial
design? One important reason is an inherent dearth of some types of Ss; this
consideration is especially prominent in the study of various handicapped
populations. If one wants to include such types of Ss in his study, he
either must balance them with a like number (pitifully small) of other
groups for his study, or he must settle for an unequal-n design. A
second reason might be inadvertent experimental mortality (attrition)
over the course of the experiment, where one would not for some reason
have enough supplementary Ss to substitute for the missing ones in the
data matrix. A third reason could be forced experimental mortality during
the study when the investigator learns that some of his Ss, who had
previously been identified as being appropriate to the study, really are not
suitable; thus, rather than discard the whole study, the experimenter
analyzes his remaining unequal-n matrix. However, whatever the reasons for
attempting to analyze an unequal-n data matrix, the range of methods for
treating such matrices is relatively unfamiliar to most researchers. The
purpose of this paper is to survey existing methods of both common and
out-of-the-way nature, as well as to introduce some previously unpublished
techniques.

PROCEDURE

Data:

To facilitate discussion of the methods described herein and to provide
readers with a means of verifying the accuracy of their understanding of
the analytical techniques, an empirical comparison of all procedures was
undertaken by means of one master data matrix for a 3x3 design. Winer
(1962) made an initial step in this direction when he used empirical
comparisons between least-squares and unweighted-means ANOVA; the present
study extends the empirical comparison notion by also including 7 other
unequal-n techniques, as well as the original equal-n solution. Table 1
shows an equal-n matrix where the hypothetical investigator intended 15
independent observations to be contained in each cell.

Insert Table 1 about here

The matrix reflects a typical unequal-n situation often occurring in the
remediation of mentally handicapped children where one applies treatments
(Factor A) to several levels of intelligence (Factor B). In particular,
the hypothetical example assumes that 3 perceptual-motor training programs
(A1 being the worst, A2 average, and A3 best) were given to 3 levels of
intelligence (the range of B1 being 91-105; B2, 76-90; B3, 61-75). The
criterion is assumed to be the visual sequential memory subtest of the Illinois
Test of Psycholinguistic Abilities (ITPA), with a possible score range of
0 to 41. The Ss are assumed to be of chronological age 6 to 8 years. The
data generation for this empirical simulation was aimed at producing quite
strong main effects for factors A and B but quite negligible interaction
between the two. Further, to achieve the common happening in which, regardless
of mean differences among factorial levels, score ranges across cell categories
often overlap to a certain extent, the ITPA scores were allowed to telescope
as shown in Table 2. The degree of overlap is consistent across levels within
either factor.

Insert Table 2 about here

The individual scores in each cell of Table 1 were generated
by 9 independent randomizations based upon the range limits set in Table 2
(Rand Corporation, 1955). The complete 15/cell-n matrix was used only as a
pivot for discussion in comparing the several unequal-n procedures. Each
unequal-n analysis was computed on the data matrix that results from Table 1
when the italicized entries were deleted. For the unequal-n matrix derived
from Table 1, one sees that the cell frequencies range from 10 to 15, with no
proportionality among rows or columns assumed; that is, the unequal-n matrix
in this study is the "worst" that could arise with respect to the orthogonality
issue.

Analyses: Since the majority of unequal-n techniques are not available
in programmed form, all computations were completed by electronic calculator,
with systematic checking to ensure accuracy. A total of 10 unequal-n
procedures were compared in this study. A procedure is described at length only
if it is not available elsewhere. The 10 methods can be grouped under four
major headings.

(1) UNALTERED DATA: The two unequal-n techniques that fall under this
heading are also the most widely known, used, and programmed approaches out
of the 10 discussed in this paper. The two methods are known as
least-squares analysis and unweighted-means analysis. As pointed out by Winer
(1962), in cases where the levels of one factor are proportional to actual
population strata so that irregular cell frequencies result naturally, then
least-squares ANOVA is appropriate. However, if unequal frequencies in the
resultant working sample are not related to the population in a natural
proportionality (that is, unequal cell frequencies might be the result of
random attrition), then unweighted-means ANOVA is better suited to unequal
cell frequencies.




Perhaps the best account of least-squares ANOVA is given by Winer (1962,
pp. 224-227, 291-297). Other readable accounts can be found in Snedecor &
Cochran (1967, pp. 477-483, 488-493) and in Ferguson (1966, pp. 319-323). For
those particularly interested in trend ANOVA, one should consult Gaito (1965),
Black & Davis (1966), and Ferguson (1966, pp. 343-346). For further reading,
see Kempthorne (1952, pp. 80-81), Rao (1952, pp. 96-98), Gourlay (1955),
Snedecor (1956), Kenney & Keeping (1954), Wilk & Kempthorne (1956), Brandt
(1932), Strand & Jessen (1943), Yates (1934), Stevens (1948), and Federer &
Zelen (1966).

When circumstances behind an unequal-n data matrix indicate that
unweighted-means ANOVA is appropriate, one can refer to the examples given in
Winer (1962, pp. 103-104, 222-224, 241-244, 374-378) and Snedecor & Cochran
(1967, pp. 475-477). For further reading, see Gowen (1952).

(2) DATA SUBSTITUTION: Four methods are worthy of consideration: (a)
substitution of the grand mean, (b) substitution of the cell mean, (c)
substitution a la Winer, and (d) substitution a la Snedecor & Cochran. All four
procedures have in common the attempt to add bits of data to the original
unequal-n matrix until it becomes, literally, an equal-n paradigm amenable to
classical ANOVA. The only modification that must be made to the classical
statistical machinery is, logically enough, to adjust the degrees of freedom
for both within-cells variation and total variance.

For the grand mean method, the mean of the entire unequal-n matrix is
computed and substituted for each bit of missing data. For the cell mean
method, wherever a cell has one or more missing values, the mean of that
cell is computed and substituted for each missing score within that cell.
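
The two mean-substitution rules and the accompanying adjustment of degrees of freedom can be sketched in Python. This is only an illustration under stated assumptions, not the computation used in the study: the helper names `substitute` and `adjusted_df`, the dictionary layout, and the toy scores are all hypothetical.

```python
from statistics import mean

def substitute(cells, target_n, method="cell"):
    """Pad each cell to target_n scores (hypothetical helper).

    cells:  dict mapping a cell label to its list of observed scores.
    method: "cell" fills with that cell's own mean; "grand" fills with
            the mean of every observed score in the unequal-n matrix.
    """
    grand = mean(x for scores in cells.values() for x in scores)
    filled = {}
    for label, scores in cells.items():
        fill = mean(scores) if method == "cell" else grand
        filled[label] = list(scores) + [fill] * (target_n - len(scores))
    return filled

def adjusted_df(cells, target_n):
    """Within-cells and total df, less one df per substituted value."""
    n_cells = len(cells)
    n_substituted = sum(target_n - len(s) for s in cells.values())
    df_within = n_cells * (target_n - 1) - n_substituted
    df_total = n_cells * target_n - 1 - n_substituted
    return df_within, df_total

cells = {"A1B1": [10, 12, 14], "A1B2": [20, 22]}  # toy data, one score missing
by_cell = substitute(cells, 3, "cell")
by_grand = substitute(cells, 3, "grand")
print(by_cell["A1B2"], by_grand["A1B2"], adjusted_df(cells, 3))
```

With one substituted score, the adjusted degrees of freedom equal those of the five real observations, which is the point of the adjustment.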

The substitution method of Winer (1962, p. 281) was designed for
situations in which an entire cell is missing. However, in most
real-life unequal-n matrices, one almost always has some data within every
cell. Thus, the logical extension of Winer's method in which one obtains row
(and column) means of the cell means within the row (and column) that
contains the missing cell, is to obtain comparable row (and column) means
using every individual child's score (including the scores in the deficient
cell).

Further discussions of data substitution can be found in Cochran and
Cox (1957, pp. 80, 110, 125, 227, 302, 400, 413, 450, 512), Healy and
Westmacott (1956), Lindquist (1953, p. 148), Afifi and Elashoff (1966), Lord
(1955), Federer (1955, pp. 124-127, 133-134), and Bennett and Franklin (1954,
pp. 382-383). Snedecor and Cochran (1967, pp. 320-321) and Li (1964, pp.
231-236-237) present a very interesting iterative procedure for supplying
two or more missing values in the data matrix. Basically, one chooses any
one of the two or more missing values, estimates a reasonable value, and makes
the substitution. The other missing value is estimated with a least-squares
formula as though there were only one value missing. Then one goes back and
estimates the first value on the basis of the second one and so on, back
and forth, until the values change only by very small amounts. Degrees of
freedom are again adjusted for total sum of squares and error sum of squares
after stabilization has occurred. For exact least-squares methods of data
substitution, see Li (1964, pp. 227-243). Winer (1962, pp. 281-283) also
provides a method that minimizes the interaction effects. Another basic
reference with examples is Snedecor and Cochran (1967, pp. 317-321). Finally,
examples of data substitution can be readily found in special education
research (e.g., Bloom, 1967; Prehm, 1967; Halpern, Mathieu, & Butler, 1968).
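
The back-and-forth iteration for two or more missing values can be sketched as follows. The sketch assumes a two-way layout with a single observation per cell and uses the familiar single-missing-value least-squares estimate, x = (r*R + c*C - G) / ((r - 1)(c - 1)), where R, C, and G are the row, column, and grand totals of the other values. The function name `fill_missing` and the toy table are hypothetical, and the cited texts should be consulted for the exact procedure.

```python
def fill_missing(table, tol=1e-9, max_iter=100):
    """Iteratively estimate missing cells (None) in an r x c table."""
    r, c = len(table), len(table[0])
    missing = [(i, j) for i in range(r) for j in range(c) if table[i][j] is None]
    known = [x for row in table for x in row if x is not None]
    start = sum(known) / len(known)            # rough starting guess
    work = [[start if x is None else x for x in row] for row in table]
    for _ in range(max_iter):
        biggest_change = 0.0
        for i, j in missing:
            # Totals exclude the cell being re-estimated but include the
            # current guesses for every other missing cell.
            row_total = sum(work[i]) - work[i][j]
            col_total = sum(work[k][j] for k in range(r)) - work[i][j]
            grand_total = sum(map(sum, work)) - work[i][j]
            new = (r * row_total + c * col_total - grand_total) / ((r - 1) * (c - 1))
            biggest_change = max(biggest_change, abs(new - work[i][j]))
            work[i][j] = new
        if biggest_change < tol:               # values have stabilized
            break
    return work

# Toy additive table (row effects 0, 2, 5; column effects 10, 11, 13)
# with two entries removed; their true values were 10 and 15.
table = [[None, 11, 13],
         [12, 13, None],
         [15, 16, 18]]
filled = fill_missing(table)
print(round(filled[0][0], 6), round(filled[1][2], 6))
```

Because the toy table is exactly additive, the iteration should settle on the removed values; with real data it settles on the values that minimize the residual sum of squares.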

(3) DATA DELETION: Another major attempt to form an equal-n matrix
from an originally unequal-n paradigm is to use random deletion of cell
entries. One looks at the n of the smallest cell and accordingly "prunes"
all other cells down to that size. Independent runs through tables of
random numbers are used to accomplish an unbiased deletion in the "oversized"
cells.
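
A minimal sketch of random deletion follows. It assumes a pseudo-random generator in place of a printed table of random numbers; the function name `prune_to_smallest` and the toy scores are hypothetical.

```python
import random

def prune_to_smallest(cells, seed=None):
    """Randomly keep n_min scores per cell, n_min being the smallest cell size."""
    rng = random.Random(seed)                  # seeded for reproducibility
    n_min = min(len(scores) for scores in cells.values())
    # rng.sample draws without replacement, so no score is kept twice.
    return {label: rng.sample(scores, n_min) for label, scores in cells.items()}

cells = {"A1B1": [21, 25, 19, 30, 27], "A1B2": [14, 18, 16]}  # toy data
pruned = prune_to_smallest(cells, seed=42)
print(pruned)
```

The smallest cell keeps all of its scores (possibly reordered), while every larger cell loses a random subset.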

Closely related to the topic of random deletion of observations is the
systematic deletion of highly discrepant observations. Snedecor and Cochran
(1967, pp. 321-323) present a very enlightening discussion on the rejection
of extreme observations. Most rejection methods are based on tests of
significance of residuals of observations from expected values. Edwards (1960,
pp. 166-168) also describes a method for rejection of discrepant observations
on the basis of confidence intervals. Mainland (1968), on the other hand,
takes opposition to all methods of rejecting observations; the reader is
advised to examine Mainland's notes before employing test-of-significance
methods. For further reading, see Anscombe (1960), Anscombe and Tukey (1963),
Li (1964, pp. 239-240), and Searls (1963). Some interesting examples of
data deletion in applied situations are Shubert, Jansen, & Fulton (1967) and
Dawson (1967).

(4) DATA CLUSTERING: In line with the philosophy of the attempts of
data deletion and data substitution to form equal-n matrices out of unequal-n
ones, the data clustering techniques coalesce several observations within a
cell into fewer observations but with no loss or gain in data. The data
clustering techniques are without doubt the least known of unequal-n
methods; indeed, some of the procedures to be described here have never
been published before.

The only data clustering technique that has been discussed at all is
ANOVA where cell means become the units of analysis. In data matrices
where all cells have some entries, but cell discrepancies are such as to
violate the approximately-equal frequency rule, the within-cells variation
is ignored. The highest-order interaction is used as the estimate of error;
however, the assumption must be made that the interaction is negligible. In
effect, the ANOVA is carried out as though single replication were the case.
The basic mathematical defense of the method is given by Finney (1960, p. 48)
in terms of differential coefficients of regression functions. The use of
interactions as error terms is discussed by Edwards (1960, p. 211), Ferguson
(1966, pp. 310-311, 314-316), Lindquist (1953, p. 114), and Scheffe (1959,
pp. 247~146). An example of using the highest-order interaction as error is
given by Ling (1968).
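
The cell-means analysis can be sketched for the two-way case as below. The sketch assumes one mean per cell and uses the A x B interaction as the error term, as the text describes; the function name `cell_means_anova` and the toy matrix of means are hypothetical.

```python
def cell_means_anova(means):
    """Two-way ANOVA on a matrix of cell means (single replication).

    The interaction mean square is the error term, which is defensible
    only if the true interaction is negligible. A perfectly additive
    matrix gives zero error and would raise ZeroDivisionError here.
    """
    a, b = len(means), len(means[0])
    grand = sum(map(sum, means)) / (a * b)
    row_means = [sum(row) / b for row in means]
    col_means = [sum(col) / a for col in zip(*means)]
    ss_a = b * sum((m - grand) ** 2 for m in row_means)
    ss_b = a * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in means for x in row)
    ss_ab = ss_total - ss_a - ss_b             # interaction, used as error
    ms_error = ss_ab / ((a - 1) * (b - 1))
    return {
        "ss_a": ss_a, "ss_b": ss_b, "ss_ab": ss_ab,
        "f_a": (ss_a / (a - 1)) / ms_error,
        "f_b": (ss_b / (b - 1)) / ms_error,
    }

toy_means = [[1, 2, 3], [2, 3, 5], [4, 5, 6]]  # hypothetical cell means
result = cell_means_anova(toy_means)
print(result)
```

Each F ratio here has (a - 1) or (b - 1) numerator degrees of freedom and (a - 1)(b - 1) denominator degrees of freedom.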

A new procedure of random data clustering was devised in late 1968 or
early 1969 by J. R. McGowan but never before published. He suggested forming
random clusters of data within each cell of the original unequal-n matrix.
The number of randomly formed clusters is the same as the number of original
entries in the smallest cell. In this sense, the method might be called
unreplicated random data clustering because some of the clusters will never
have more than 1 observation. For example, if the smallest cell has two
entries, then in a cell with seven entries, four data would be randomly
assigned to one cluster and the remaining three data in that cell would
become the second cluster of the cell. Clearly, the clusters in the smallest
cell would always contain only one score each. In the example cited, each
cell would contain two clusters, each cluster in turn holding varying numbers
of data. After randomly assigning within a cell all original scores to their
new cluster "identities", the average of each within-cell cluster is computed.
The resulting matrix of equal-frequency, mean data is subjected to a regular
equal-frequency ANOVA with the new number of averages taken as the number of
data. As far as the authors know, McGowan was the first to put forth such a
method. The technique seems to hold interesting possibilities. It should be
noted that if the smallest cell has only one original observation, then the
"random cluster" method becomes merely cell-means ANOVA (single replication),
mentioned just above. In the present example, cell A1B1 is of size 13, while
the smallest size of any cell is 10. One wants 10 clusters per cell. The
only combination of double clusters (those with 2 scores) and single clusters
(those with only 1 score) that yields a total of 10 clusters and still uses all
13 individual scores, is 3 doubles and 7 singles. To determine which
observations within cell A1B1 go into which of the double and single clusters,
the cluster numbers (labels) of 0 to 9 are assigned from a table of random
numbers to the observations in the order that the latter are listed within the
unequal-n data matrix. Once a digit occurs the second time, it cannot be used
again. Further, since one wants only 3 double clusters, only 3 of the digits
can be allowed to occur the second time. The averages of all double clusters
are computed and, along with the single clusters of the original observations,
are entered into a new equal-n matrix upon which the classical ANOVA is finally
computed.
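
A rough Python sketch of unreplicated random data clustering is given here. It swaps the table-of-random-digits labelling for a shuffle followed by round-robin dealing, which also yields cluster sizes that differ by at most one; that substitution, the function names, and the toy scores are assumptions made for illustration.

```python
import random
from statistics import mean

def random_clusters(scores, k, rng):
    """Randomly split scores into k clusters whose sizes differ by at most 1."""
    shuffled = list(scores)
    rng.shuffle(shuffled)
    # Dealing round-robin gives the first (len % k) clusters one extra score.
    return [shuffled[i::k] for i in range(k)]

def cluster_mean_matrix(cells, seed=0):
    """Replace every cell by k cluster means, k being the smallest cell size."""
    rng = random.Random(seed)
    k = min(len(scores) for scores in cells.values())
    return {label: [mean(c) for c in random_clusters(scores, k, rng)]
            for label, scores in cells.items()}

# Toy cells of sizes 13 and 10, mirroring the sizes discussed in the text.
cells = {"A1B1": list(range(13)), "A2B2": list(range(100, 110))}
sizes = sorted(len(c) for c in random_clusters(cells["A1B1"], 10, random.Random(1)))
matrix = cluster_mean_matrix(cells)
print(sizes)
```

For the 13-score cell the split comes out as 7 singles and 3 doubles, matching the combination worked out above, and the resulting matrix has 10 entries per cell for a classical equal-n ANOVA.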

The last method compared in this study is an extension of the preceding
clustering technique and might be termed replicated random data clustering;
that is, no cluster will ever have fewer than 2 observations. In the present
example where the smallest cell size is 10, one wants to generate 5 clusters in
each cell so that at least 2 observations per cluster result.
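
The cluster count for the replicated variant can be worked out with a few lines of Python. The helper names and the list of cell sizes are hypothetical, and the sketch assumes that clusters within a cell are kept as nearly equal in size as possible.

```python
def replicated_cluster_count(cell_sizes, min_per_cluster=2):
    """Largest cluster count that leaves every cluster with enough scores."""
    return min(cell_sizes) // min_per_cluster

def cluster_sizes(n_scores, n_clusters):
    """Cluster sizes for one cell, as equal as possible."""
    base, extra = divmod(n_scores, n_clusters)
    return [base + 1] * extra + [base] * (n_clusters - extra)

cell_sizes = [13, 10, 15, 12]                  # hypothetical cell frequencies
k = replicated_cluster_count(cell_sizes)       # smallest cell is 10, so k = 5
layout = {n: cluster_sizes(n, k) for n in cell_sizes}
print(k, layout)
```

Every cluster then holds at least 2 scores, and the scores can be assigned to clusters at random before the cluster means are analyzed.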

RESULTS AND DISCUSSION

The summary ANOVA tables for 9 unequal-n methods are presented in Table 3.
While it must be remembered that the results are only an empirical comparison
within a limited numerical example, one can draw some conclusions. First, one
needs some basis for comparison before he can suggest that a certain unequal-n
method appears to be a rather poor or good approximation to what would have
been the results of the original equal-n experiment. Since the data in this
illustration were quite carefully selected to reflect pre-specified differences
and to avoid unwanted biases, the complete equal-n solution was available to
serve as the basic "control" analysis. One can see the strength of the two
main effects, the negligibility of the interaction, and the relatively small
within-cells variation. Because the equal-n solution would normally be
unavailable, the exact least-squares ANOVA is perhaps the most appropriate
"control" for all other unequal-n methods to be compared with. Even though
the random attrition of the hypothetical example would dictate the
unweighted-means solution, least-squares ANOVA is a better approximation.

Insert Table 3 about here

The most discrepant set of results occurs in connection with data
substitution by the grand mean. Where there should have been a quite negligible
interaction, a significant one emerged. On the other hand, substitution by
cell means is a quite accurate approximation of the equal-n results.

When one turns to theoretical considerations of the separate unequal-n
methods, a number of interesting insights are yielded. First, one returns
to the notion that unequal-n designs can be avoided by sound pre-experiment
planning. When one considers an area such as handicapped children (special
education), most research does not yield equal cell frequencies. It is
difficult enough to get equal numbers of, say, educable mentally retarded Ss
for various treatments to be compared on just the factor of treatments
itself, but even more difficult to get an equal distribution of sex within the
equi-sized EMR groups under each treatment to produce a factorial design.
Adding more control variables usually leads to even greater fluctuations in
cell frequencies. Thus special education researchers seem more content to
measure differences only among treatments in nonfactorial, one-way designs.
When an investigator uses one-way ANOVA, valuable information on interactions
with non-treatment variables (such as sex, age-level, level of previous
functioning, class of brain damage, etc.) is lost.

Nonetheless, proper pre-experiment planning should not be dismissed
lightly with regard to avoiding unequal-n data matrices. Consider the case
of a three-way factorial ANOVA design in which the factors are treatments,
sex, and levels of auditory impairment. A control variable such as auditory
impairment that lends itself to a numerical continuum often leads to unequal
cell frequencies when the design paradigm is further subdivided by other
control variables, such as sex. In the present example, during the planning
stages of the experiment, auditory impairment of all potential candidates for
participation in the study is determined. A stratification problem, inherent
in control variables of continuous type, is then posed. The researcher must
decide whether he wants to form control strata on the basis of realistic
special education criteria or on the basis of computational expediency.
In the latter case, equal cell frequencies can be established no matter
how artificial the cut-off points. Too often both theoretical and applied
statisticians get side-tracked in trying to establish perfect designs and
avoiding statistically difficult, but perhaps more meaningful and
generalizable, situations. Of course, even if artificial stratification points
have been chosen on the control variable distributions for achieving equal
cell frequencies, experimental attrition may occur during the experimental
period. For further reading, see Hess, Sethi, & Balakrishnan (1966).

However, even the best of experimental planners cannot avoid every
pot-hole in the road of design. Consequently, statistical methods for
handling unequal frequencies must be considered. With regard to the first
category of unequal-n methods (those dealing with unaltered data), Winer
(1962) claims least-squares ANOVA provides more powerful tests of significance
than unweighted-means solutions. It should be cautioned that one basic
difference between least-squares ANOVA and unweighted-means ANOVA is that the
variance relation among the total, between, and within components holds only
for the least-squares method. In other words, true orthogonality of variance
components exists only for the least-squares ANOVA. The only other apparent
difference between least-squares ANOVA and unweighted-means ANOVA is in
obtaining a best-fit regression model based on cell means and average
frequencies without response surface regression weighting. Basically, in a
least-squares two-way ANOVA, one solves a set of normal equations analogous to
that in multiple regression. As in covariance analysis, one makes adjustments
to the raw sums of squares. One uses the exact cell, column, and total
frequencies along with cell totals. First, one computes unadjusted row, column,
and cell sums of squares. There are then two options: (a) SSab(adj.) can be
computed directly from means of cell means or, (b) one can go through the
unadjusted, exact frequency analysis, computing SSb(adj.) and SSa(adj.) by
the abbreviated Doolittle algorithm or, somewhat easier, by the Dwyer
square-root algorithm, and then obtain SSab(adj.) by subtraction. To use a
physical analogy, if one pictures different thickness poker chips for
different magnitude scores arranged vertically one on top of the other in their
respective cells, the least-squares ANOVA drops a response surface blanket
over the stacks of chips naturally, taking into account different frequencies
as well as different sizes of scores. On the other hand, unweighted-means
ANOVA does not throw the blanket down over what exists; rather, it
statistically builds by leveling off the peaks and then fits a uniform
unweighted surface on the situation, taking account only of differences in
cell score averages.
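
The "leveling" performed by the unweighted-means approach can be illustrated with a small Python sketch. It computes only the row (Factor A) sum of squares, treating every cell mean as if it rested on the harmonic mean of the observed cell frequencies. The function name and the toy scores are hypothetical, and the formula follows the usual textbook presentation of the unweighted-means solution rather than anything printed in this paper.

```python
from statistics import harmonic_mean, mean

def unweighted_means_ss_rows(cells):
    """Row sum of squares under the unweighted-means solution.

    cells: list of rows; each row is a list of cells; each cell is a
           list of raw scores (cell sizes may differ).
    """
    freqs = [len(scores) for row in cells for scores in row]
    n_h = harmonic_mean(freqs)                 # one frequency for all cells
    cell_means = [[mean(scores) for scores in row] for row in cells]
    row_means = [mean(row) for row in cell_means]  # means of cell means
    grand = mean(row_means)
    b = len(cells[0])                          # number of columns
    ss_rows = n_h * b * sum((m - grand) ** 2 for m in row_means)
    return n_h, ss_rows

toy = [[[1, 2, 3], [4, 6]],
       [[7, 9], [10, 11, 12]]]                 # hypothetical 2 x 2 design
n_h, ss_rows = unweighted_means_ss_rows(toy)
print(n_h, ss_rows)
```

Note that the individual cell frequencies enter only through the single harmonic mean, whereas a least-squares solution would carry each cell's own frequency into the normal equations.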

In dealing with least-squares solutions, an important and generally
unappreciated issue is that of how far the observed frequencies can deviate
from the frequencies expected under proportionality. This question could be
attacked by an application of factorial Chi-square analysis. However, since
Chi-square is a test of poor power, its results cannot be relied upon too
heavily. The present authors contend that least-squares and unweighted-means
ANOVA are applied too often in situations where their mathematical
appropriateness cannot be justified. This is especially unfortunate because the
tests of appropriateness are themselves rather weak and under-powered. Under
expected equal frequencies, Snedecor and Cochran (1967) suggest that
discrepancies in cell frequencies should lie within a 2 to 1 ratio, but only if
the majority of cell frequencies are in closer agreement. However, this rule is
given without any mathematical evidence to support it. Ferguson (1966, pp.
319-323) provides a discussion of ANOVA from the standpoints of Tsao's
(1946) methods for equal and proportional expected frequencies. However,
the reader must be aware of the possibility of bias, both positive and
negative, in F tests when deviations from the expected frequencies are
large. Unfortunately, one has no completely satisfactory method of testing
such deviations. Similarly, turning from least-squares solutions to
unweighted-means techniques, one worries about how much variation can be
allowed among the unequal n's relative to the original expected frequencies.
The situation is compounded by the fact that one uses the harmonic
mean of the observed cell frequencies in obtaining sums of squares, rather
than the original frequencies.
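
The Chi-square check of proportionality mentioned above can be sketched as follows. Expected frequencies are taken as (row total x column total) / N, the usual definition of proportional cell frequencies; the function name and the two toy frequency tables are hypothetical, and no significance level is computed.

```python
def proportionality_chi_square(freqs):
    """Chi-square statistic and df for departure from proportional cell n's."""
    row_totals = [sum(row) for row in freqs]
    col_totals = [sum(col) for col in zip(*freqs)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(freqs):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df

proportional = proportionality_chi_square([[10, 20], [20, 40]])
disproportional = proportionality_chi_square([[10, 15], [15, 10]])
print(proportional, disproportional)
```

A statistic of zero indicates exactly proportional frequencies. As the text cautions, a small value from a low-powered test is weak evidence that proportionality actually holds.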

The second major set of unequal-n methods deals with substitution of
data to obtain an equal-n matrix. Beginning with the grand mean method,
one might suspect that it would produce a very poor approximation to the
original unequal-n matrix, or at worst, to the least-squares unequal-n
solution. The fact that the grand mean probably is not really close to
any specific cell means distorts the original cell means quite a bit, as
well as increasing within-cells variation.

More positive things can be said about the second technique of data
substitution: insertion of cell means for a cell's missing observations.
First, substitution of cell means does not change the original cell mean.
Second, and perhaps most importantly, the method does not affect the
within-cells variation. Finally, the technique provides a very good
approximation to both the least-squares and equal-n solutions.
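
The first two properties are easy to verify numerically. In the sketch below the scores and the stand-in grand mean of 30.0 are hypothetical; the within-cell sum of squares is computed about the cell's own mean.

```python
def cell_mean(scores):
    return sum(scores) / len(scores)

def within_ss(scores):
    """Sum of squared deviations of the scores about their own mean."""
    m = cell_mean(scores)
    return sum((x - m) ** 2 for x in scores)

observed = [18, 22, 25, 27]                        # cell with 2 scores missing
padded_cell = observed + [cell_mean(observed)] * 2 # cell mean substituted
padded_grand = observed + [30.0] * 2               # assumed grand mean of 30

print(cell_mean(observed), within_ss(observed))
print(cell_mean(padded_cell), within_ss(padded_cell))
print(cell_mean(padded_grand), within_ss(padded_grand))
```

Padding with the cell mean adds zero deviations, so both the mean and the within-cell sum of squares are untouched, whereas padding with a value far from the cell mean shifts the mean and inflates the within-cell variation.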

The data substitution method of Winer, as modified for purposes of
this paper, makes use of both main effect means and the grand mean. The
basic structure underlying Winer's technique is both logical and pleasing
in its ease of application. However, the method suffers from some severe
limitations: (a) it assumes no interactions of any significance; (b)
generalized "halo" distortion occurs simply because of data from outside
the cell of interest entering into the estimation; and (c) severe distortion
occurs if the cell with missing data lies at either end of a score
continuum. Winer suggests that a preferable alternative would be to use a
multiple regression equation in connection with the response surface of the
experiment.

Both the original Winer and Snedecor-Cochran substitution methods were
designed for cells that had no data at all in them. While Winer's method
could be modified to allow any data that might be available within the cell
of interest to enter into the substituted data estimates, the method of
Snedecor and Cochran must remain in its original form and thus could not be
used in the empirical comparison of this study.

Some final comments on data substitution are in order. In realistic
learning situations where it is likely that experimental mortality will
occur in a one-day study, the investigator might consider running a separate
replication of the primary study so as to have data in reserve for
substitution purposes. It seems statistically more pleasing to substitute real
data than to make elaborate assumptions about the response surface. For example,
if the desired cell size is 5, and if one cell is missing 2 observations for
reasons unrelated to the experiment, then the corresponding data cell from
the reserve replication would be randomly "robbed" of 2 entries. The
cautious researcher would then reduce the degrees of freedom for both the
error and total sums of squares by 2. Of course, in any method of data
substitution, the degrees of freedom for the error sum of squares and the total
sum of squares have to be adjusted accordingly; clearly, the principle of
diminishing returns applies, since the error mean square becomes larger
in the process.

Both conceptually and practically, data deletion, the third major
category of methods for treating unequal-n data matrices, seems quite weak.
The technique is suitable only when the original cell size expectation is
large. This procedure can be extremely wasteful if cell frequencies are
highly discrepant. A workable compromise is to find the optimum
combinations of data substitution and data deletion in order to achieve the
least amount of "synthetic" data in balance with the maximum degrees of
freedom. Whether or not a subject is to be discarded from analysis is
an issue which only the investigator can decide. However, leaving all
original data present and unmodified seems to be the most defensible course.
Suppose, for example, that a normal pupil refused to cooperate on a test or
was obviously working far below his level. Many analysts would either
discard this data or at least regression-modify it. Clearly, these
procedures violate reality. If normal pupils occasionally behave erratically,
then the analysis should reflect this fact, not ignore it.

The last group of unequal-n data techniques concerns data clustering.
The use of original cell means as the unit of analysis is the only familiar
method of clustering; in other words, one has turned his unequal-n data
matrix into an equal-n, single replication design. There is very little in
the literature about single replication studies where all factors are fixed.
Ferguson (1966, p. 311) discusses this situation briefly. Perhaps one could
reason that, if the highest-order interaction of completely fixed factors
is to be the error term, or at least part of it, then (since this "error" is
not operating in a random fashion) it would comprise a systematic
overestimate. In this case, a randomly operating error term is treated as
though it is the minimum error that could exist, and the more systematic the
error term, the greater the inflation. In other words, a non-negligible
interaction chosen as an error term can be considered an upper limit to the
error. At worst, one has conservative tests of his main effects and other
interactions.
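
The single replication analysis described above can be sketched for the
two-factor case as follows. This is a minimal illustration, not the
computation used in the study: the table of cell means is invented, the
function name `cell_means_anova` is hypothetical, and the A x B interaction
mean square is simply taken as the error term for both main effects.

```python
def cell_means_anova(means):
    """Two-way ANOVA on a table of cell means (one value per cell).

    `means[i][j]` is the mean of the cell at level i of A and level j of B.
    The A x B interaction mean square is used as the error term.
    """
    a, b = len(means), len(means[0])
    grand = sum(sum(row) for row in means) / (a * b)
    row_means = [sum(row) / b for row in means]
    col_means = [sum(means[i][j] for i in range(a)) / a for j in range(b)]

    ss_a = b * sum((m - grand) ** 2 for m in row_means)
    ss_b = a * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in means for x in row)
    ss_ab = ss_total - ss_a - ss_b   # residual = interaction in this design

    df_a, df_b, df_ab = a - 1, b - 1, (a - 1) * (b - 1)
    ms_ab = ss_ab / df_ab
    return {
        "SS_A": ss_a, "SS_B": ss_b, "SS_AB": ss_ab,
        "df_error": df_ab,
        "F_A": (ss_a / df_a) / ms_ab,
        "F_B": (ss_b / df_b) / ms_ab,
    }


# Hypothetical 2 x 3 table of cell means.
table = [[10.0, 12.0, 14.0],
         [13.0, 15.0, 20.0]]
result = cell_means_anova(table)
print(result)
```

If a real interaction is present, SS_AB is inflated by it, the F ratios
shrink, and the tests of the main effects become conservative, which is the
point made in the paragraph above.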

To what extent must the assumptions of homogeneity of variance and
normality be met in the case where cell means are used as the units of
analysis? There is zero variability within each cell. Normality of
individual scores cannot even be considered. Perhaps these thoughts, along
with the robustness of the F test, make this method of analysis one of
the soundest of all. However, one should not assume that ANOVA by cell
means is foolproof; Finney (1960, pp. 88-89) considers the procedure
apropos only when the design is "saturated" with factors, say 6 or more.
For further discussions about violation of basic ANOVA assumptions, see
Snedecor and Cochran (1967, pp. 278, 324-325), Scheffe (1959, pp. 360-364),
Edwards (1960, pp. 125-128, 132), Box (1953), Box (1954), and
Lindquist (1953, pp. 72-90).

The other two methods of data clustering (replicated and unreplicated
random data clustering) appear pleasing at first glance because they
retain all original bits of the unequal-n data, do not substitute contrived
and distorted data, and yield equal-n's for classical ANOVA to be applied.
Further, the replicated version seemed to offer somewhat greater
reliability of individual cluster means than the unreplicated technique. In
spite of these apparent advantages, the empirical comparison demonstrated
that both techniques were poor approximations to the equal-n and unequal-n
control solutions.
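
The general idea of random data clustering can be sketched as follows. The
sketch rests on an assumption about the procedure, namely that each cell's
scores are randomly partitioned into the same number k of clusters and that
the k cluster means then serve as the equal-n observations. The function
name `random_cluster_means`, the data, and the choice k = 2 are all
hypothetical and are not taken from the study.

```python
import random


def random_cluster_means(scores, k, rng):
    """Randomly partition one cell's scores into k non-empty clusters of
    near-equal size and return the k cluster means."""
    if k > len(scores):
        raise ValueError("k cannot exceed the cell size")
    shuffled = list(scores)
    rng.shuffle(shuffled)
    # Deal the shuffled scores round-robin into k clusters.
    clusters = [shuffled[i::k] for i in range(k)]
    return [sum(c) / len(c) for c in clusters]


# Hypothetical unequal-n data for three cells.
cells = {
    "cell_1": [12.0, 15.0, 11.0, 14.0],
    "cell_2": [20.0, 18.0, 22.0, 19.0, 21.0, 20.0],
    "cell_3": [9.0, 13.0, 10.0, 12.0, 11.0],
}
rng = random.Random(0)  # fixed seed so the partition is reproducible
clustered = {name: random_cluster_means(s, 2, rng) for name, s in cells.items()}
print(clustered)  # every cell now contributes exactly 2 "observations"
```

All original scores are retained and every cell ends with the same n, but
the cluster means depend on the random partition, so a different seed gives
a different analysis.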
SUMMARY

This paper has brought together within a single perspective several 
distinct methods for handling complicated, unequal-n data matrices in 
ANOVA. A discussion of each technique's virtues and problems was
presented. Further, an empirical comparison within a tightly controlled
numerical example was undertaken among the methods. Substitution by
cell means appeared to give the most accurate approximation to the original
equal-n solution, as well as to the least-squares unequal-n results.
However, in the final analysis, only formal mathematical statistics can
establish the superiority of one method over the other. It is hoped
this paper will give impetus to mathematical research into the relative
theoretical properties of each technique. 
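
For readers who wish to experiment, substitution by cell means can be
sketched in a few lines. The sketch assumes that each deficient cell is
padded up to a common target size with copies of its own mean; the function
name `pad_with_cell_mean` and the scores are hypothetical.

```python
def pad_with_cell_mean(scores, target_n):
    """Return the cell's scores padded to target_n with the cell mean."""
    mean = sum(scores) / len(scores)
    return list(scores) + [mean] * (target_n - len(scores))


def within_ss(scores):
    """Sum of squared deviations of the scores about their own mean."""
    mean = sum(scores) / len(scores)
    return sum((x - mean) ** 2 for x in scores)


cell = [8.0, 10.0, 12.0]            # hypothetical cell with 3 scores
padded = pad_with_cell_mean(cell, 5)
print(padded, within_ss(cell), within_ss(padded))
```

The padded cell keeps its original mean and its original within-cell sum of
squares, so the substituted values add no information about error even
though they raise the nominal cell size.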

The investigators wish to conclude the review by cautioning the 
reader to be thoroughly familiar with the limitations placed upon each 
method; none of the techniques presented is "foolproof." No one method
suffices for every unequal-n problem the applied researcher meets from
day to day. Some procedures have more severe restrictions than others.
With some thought, the reader can devise completely new techniques, as
well as modifications of those presented in this paper. The field of
unequal-n ANOVA methodology is far from being a "dead" research topic in
applied statistics. It should be noted, however, that some statisticians
would disapprove of several of the methods discussed here, if for no
other reason than on philosophical grounds.

In conclusion, it would be nice if the investigators could tell 
the readers to use computer programs for all their unequal cell-frequency 
needs. This cannot be done. While several programs do exist, it must
again be emphasized that most are appropriate only for certain situations.
Many of the more refined programs are difficult to use, and several have
such poor documentation of computational procedure that the user does not
know which of the methods surveyed in this review he is using.
FOOTNOTES

1 The writing of this paper was jointly supported by Research and Information
Services for Education (RISE) under Title III of the Elementary and Secondary
Education Act of 1965 (OEG-1-67-3010-2696); by Pennsylvania Resources and
Information Center for Special Education (PRISE), also under Title III (R-22-H,
48-70-0003-0); and by Montgomery County Intermediate Unit No. 23. However, the
opinions expressed herein are solely those of the investigators and do not
necessarily reflect the position or policy of the supporting agencies. BBP is
responsible for the review of literature and for the conceptualization of the
different methods of treating unequal-n data matrices. JRMcM provided the basic
idea behind the data-clustering techniques, as well as valuable criticism of the
basic thinking in this paper. RGT and LM also aided in conceptual criticism.
Finally, PAG and LHC performed the empirical analyses for this study.

2 The investigators welcome correspondence relating to this article. Address
all comments to Dr. Barton B. Proger, Director of Evaluation and Dissemination,
Pennsylvania Resources and Information Center for Special Education, 443 South
Gulph Road, King of Prussia, Pennsylvania 19406.

3 Some of the mathematical premises behind estimation of missing values by
minimization of residual sum of squares have been discussed by Jaech (1966) and
by Sclove (1972) and have subsequently been commented upon in miscellaneous
"letters to the editor" on pp. 57-58 in The American Statistician for October, 1972.

4 Glass, Peckham, and Sanders (1972) studied violation of basic ANOVA assumptions
(non-independence of errors, non-normality, and heterogeneous variances) for both
equal-n matrices and unequal-n matrices. However, the investigators in that study
were not interested per se in different methods of treating unequal-n data matrices.